Sudoku Solver

Robert Morgowicz (rjm448)
Nayanthara Sajan (ns933)
December 15, 2019


Demonstration Video


Objective

The objective of this project was to create an end-to-end system which could solve a physical, on-paper Sudoku puzzle. The system would have to be able to read the physical board, solve the puzzle, and write the solution back onto the board. With the goal of creating an embedded system in mind, we wanted all of this to occur at the press of a button, without the need for keyboard/mouse input or a display to see the completed puzzle.


Introduction

To complete our objective, the system needed to do three main things: Read, Solve, and Write. Reading was accomplished with [TODO: cv and tesseract things go here]. Solving was done in C as opposed to Python, so the board data had to be translated between languages. Once translated, the board was solved using a parallel backtracking algorithm, making use of the RPi’s four cores, before being re-translated back into Python for use in writing. To write the solution, we used an old Houston Instruments 2000 series plotter which we controlled indirectly with the RPi using SPI. We hacked the plotter’s normal pressure gauge input to receive analog inputs from a DAC which received the SPI signals from the Pi. These three components, wrapped in a main function for user control and feedback, constituted the full system.


Design and Testing

Reading

Text

Solving

As mentioned above, the actual solving of the board was done in C. We chose C for this task because we were worried Python would be too slow for the task. This turned out to be a good choice, as the solver was blindingly fast, although we got the sneaking suspicion doing it in Python would have been fine, as reading and writing were far more time-intensive tasks. This was the first part of the final design to be completed, mostly because it was software-only.

The algorithm chose for solving was recursive backtracking. Simply put, backtracking finds an empty space on the board (in this case, ours went in row-major order) and tries to put a digit in that spot. If the digit follows the three Sudoku invariants: 1) only one of that digit in that column, 2) only one of that digit in the row, 3) only one of that digit in its 3x3 grid partition, then the algorithm continues to the next empty spot. If the digit does not satisfy the invariant, the next digit is tried in that space. This continues until the algorithm reaches a case where all nine digits have been tried for a given spot, in which case, it returns to the previous digit, tries the next digit there, and continues. This traversal continues until the end of the board is filled, or the first spot has no valid digit (i.e. the space of possible digits has been fully explored and there is no solution).

The first implementation was a serial one. This was done to have a good baseline for parallel speedup and to test and debug the correctness of the solver before attempting a parallel version. In order to test the solver, we hard-coded a couple boards which could be swapped out in the C function. This would later be replaced with receiving input from Python, but we didn’t have that pipeline in-place yet, so taking a hard-coded board, solving it, and printing the solution to terminal was the extent of pre-integration testing. That being said, testing was relatively straightforward by C standards and a few seg faults and infinite recursions fixed later, the serial solver was ready.

The next task was parallelizing the solver. This task was actually quite simple, since backtracking is an embarrassingly parallelizable algorithm. The parallel structure was to first find the beginning two slots on the board where digits would be placed, then generate jobs with all 81 possibilities for those two digits and place them on a work queue. Then, four threads (one for each core on the Pi) would be spawned and each thread would remove a job from the queue, placing the first two digits in the specified positions, and check the invariant, continuing as normal if the invariant was satisfied. Two digits were chosen to create better load balance for the four cores (i.e. 81 jobs as opposed to only 9 with only one digit). Since only a single solution is desired from this algorithm, a flag was included in the solving code to halt other threads if a solution was found by one.

Parallel testing was also fairly lean. Since each thread had its own board to work with, there were no race conditions to skirt around, and the only necessary point of synchronization was making the queue thread-safe. Since the algorithm was largely the same as serial after the first two digits (indeed, the code was all but copied wholesale), correctness was quite easy to verify. However, in order to evaluate which of these algorithms was better, we included an additional performance test. It is possible to construct puzzles such that the backtracking algorithm has considerable difficulty in solving it. We used puzzles like this as a benchmark for our worst-case time.

Writing

Generic placeholder image

On the output side, there were more components and the amount of iteration and tweaking was far more involved that the other two unit sections. It would not be an exaggeration to say that we were still optimizing the writing system even far into the integration phase. The plotter was temperamental, to use civil terms, and its many idiosyncrasies were, while interesting, often unwelcome. Even so, having any base at all to start from was preferable to building one from scratch.

There were three main components to the writing system: the Pi, the DAC, and the Plotter. A DAC was necessary since the Pi did not have the analog capabilities to interface properly with the plotter’s hacked inputs. The Pi sent digital signals to the DAC over an SPI bus. The DAC translated these SPI inputs into analog voltages in the 0-2.3V range. These analog outputs were then divided through standard resistor dividers to create valid inputs to the plotter and. This was the mode of communication between the Pi and Plotter.

We began with the simple step of verifying the plotter hardware, which consisted of activating it and messing around with the knobs to ensure the servos were still functional. Unfamiliar with this technology, we struggled for a bit even turning the thing on, but with the help of a team from Professor Land’s lab, were able to activate it and find modules to control both x and y. We found there were regions of x past which the plotter’s servos could not retract, but this was not a big concern, as the y axis was shorter to begin with and the usable region was roughly square (as are Sudoku puzzles) as a result.

Confident the plotter was functional, we began the task of hacking its inputs. There are five connections on each plotter input module plug: Power (red), Ground (black), S+ (green), S- (white), and Shield (no jacket). Shield is just a shield to reduce noise on the bundled cable, so it can be grounded along with the normal ground signal. One might expect power would need to be driven, bit, in fact, this is actually a connection sourcing power from the plotter itself. As such, it is unused. Finally, S+ and S- are the two data inputs. The plotter expects two voltages very close by one another which change together. As such, we divided analog inputs as shown in the circuit below.

Generic placeholder image Generic placeholder image

In order to verify this setup would work, we drove the hacked y plotter input with a power supply and observed its behavior. The plotter was inconsistent and difficult to control with this method, but we were able to get it to react to changing the input voltage in most situations, so we continued with this method for the other plotter input. The x input proved even more stubborn than the y, but was controllable, so we continued.

We then began the other half of writing from the Pi side, working out SPI connections to the DAC. SPI was the easy step of writing. Python libraries for SPI on the pi were simple and fairly well-documented, so we had little trouble writing a software component which would compile and test. We did have some hardware trouble figuring out the precise SPI connections which needed to be made to the DAC as the datasheet used somewhat obtuse signal names, but in the end, we were able to hook it up with relative ease and step through the desired voltages.

Thankfully, once we had the DAC outputting and the plotter receiving analog signals, the two halves were mated without incident. However, we now had the task of writing software for the plotter to write digits onto a sudoku grid. We had sudoku grids printed out already for testing the reading, and since the grids we had matched up well with our plotter’s viable ranges, we used those same grids to test writing.

Our basic methodology in software, was to create functions to seek to the top left of specific grid boxes, and write functions to write each of the nine digits from that position. We began with seeking to specific boxes in the grid (in software, specific elements in a 9x9 matrix). This was done by finding the edge value of the main square composing the puzzle we could send over out SPI connection (~4000 for both x and y). Then, based on those values, we defined one-box intervals as 1/9th that max value. Our seek function would take a coordinate (0-8) as in input and write that many ninths of the max value over SPI, causing the plotter to seek to that coordinate.

The software was easy to write but when it came to testing it was much the opposite. This was the first big hurdle of writing. Our temperamental plotter was fine for seeking to y values, but would frequently get stuck when trying to seek to x values. What’s more the x module itself seemed to be faulty, as it would not infrequently re-zero itself and lock up for several minutes. Through rigorous testing, we discovered this locking-up was some kind of mechanical issue, as eventually the zero would realign after running for a while. It was facing this periodic error that led us to creating a script to “calibrate” the plotter. This was really more of a test script to ensure it was seeking properly, but by running the script and exercising the plotter, we could bring the device back into alignment.

Another big issue we had with the x module was that the plotter would not move when presented with voltages changing only minorly. We resolved this early-on ins software by having the plotter seek back to 0 whenever it moved in x, but when it came time to write digits (which needed to be contained in the one box) this was not an option. Faced with this problem and some rather unpleasant squeaking, we decided to finally buckle down and access the internals of the plotter, oiling up every wheel on it we could find. The effect was not seen immediately, but this helped fix our x problems considerably.

Now that it could seek properly, we moved to writing digits. These digits were composed of combinations of segments resembling those of a seven-segment display for ease of constructing them in software and because we did not trust the plotter to move in both x and y at the same time accurately.

This was tested by writing digits at specific values, and also writing them one after another in a row. We went back and forth how to notate ones properly and encountered a few bugs writing the digits. Specifically, digits which crossed in the middle more than once (2,4, and 5) were not properly aligned. Since the digits were distinct (if not very recognizable) we continued on with integration.

Integration

All in all, integration was largely just stitching code from the previous parts together and testing that everything still works. It sounds easy, but integration is always the longest and most painful step. If any later teams happen to be reading this, I highly recommend allocating at least half of your time to integration. It never goes as smoothly as you would like it to. Ever.

On that note, we began by trying to integrate our C solver. This meant taking an array from python, passing it to C, and receiving a C array in python and translating it back. We chose to do the translation on the Python side because C has a tendency to be obtuse and numpy scares me. That being said, the ctypes Python library, which is what we used for this, was somehow even more obtuse than C itself. Passing integer literals is relatively easy, but once it gets to arrays, things get hairy fast. Declaring an array and sending to C was ok, took some debugging, but nothing too terrible. We wanted to pass a 2d array, but constructing a higher dimensional C array in Python was more trouble than it was worth, so we resolved to pass a 1d array of 81 elements and make it 2d once passed to c.

We compiled the code as a C library which was imported into Python this necessitated some changes to the solver, as it used global variables which cause seg faults when called as a library function. These global variables had to be instead passed in the argument struct to the threads. Thankfully, the queue’s locks were not interfered with by being called in a library function (likely because they’re from pthread, which is, itself, a library). Compiling as a library is a little different to compiling C code as an executable, but this was not especially difficult to figure out if you’ve ever used a C compiler before.

Returning the array, on the other hand, was quite obtuse. We had tested this part by modifying a C array with a C function called in Python which worked well, but in this case, we were not modifying the source array (remember, each thread gets its own array copy) so we had to return a pointer to the new array. The way to do this with ctypes, (which we found out after several hours of debugging) is to declare the return type of the c library function as a ctypes array of ctype ints. If the return type is not declared, Python will infer the C pointer it gets to be a Python integer.

All told, I would advise against using ctypes. As stated before, numpy scares me and I haven’t used it, but I highly doubt it’s any worse. Ctypes documentation, much like the library itself, is somehow even more obtuse than C itself. Pointers are hard to understand on their own but making pointers only explicitly accessible through a translation function is, frankly, madenning.

Now that everything was Python-interfaceable, we created a main function which polled for a start button and included a calibration button as an interrupt for our temperamental friend. It also included three GPIO outputs to make the code’s internal state visible and signal errors. This was done with the RPi.GPIO library. The hardware for this GPIO (buttons and LEDs) along with the DAC connections can be seen on the protoboard photo below.

Generic placeholder image

First up was marrying solving and writing. We had no input array, but since the empty grid is a valid, solvable grid, the solver produced a valid solution all the same. We printed this in terminal to make sure it was correct, then, using the functions defined in writing unit developement, scanned through the array and wrote out any digits not already on the board. This meant that immediately after the board was generated, a mask array had to be generated to keep track of which digits hat to be written. We accidentally flipped true and false here and only incremented when a mask was flipped from true to false, which was briefly conusing later on, but once those issues were solved, we observed the plotter could write as expected, if not quite legibly.

Instead of traversing the array in standard row-major order, we decided to snake down the array from left to right then right to left on the next row. This was done to reduce writing time and have the plotter move as little as possible. This was the point at which we got the plotter to write out an entire board start-to-finish. By this time, oiling the parts had had its full effect, and we seldom ever saw the x module foul up. As such, we removed the code to seek back to zero when seeking to a new x position, as it was no longer necessary.

Next up, we integrated the reading code. This was mostly copy-pasing, but there were some complications. Because of certain CV components, we have to move to Python 3. Thankfully, this only broke a couple pieces of existing syntax, so it was not a huge challenge. The output from reading also had to be converted from chars to ints before being passed on.

With our full system in-place, we began optimizing it. We centralized our writing functions to all use a global maximum value in their calculations as discussed above. Before we had these as set values, but having them recomputed in the functions meant we could rescale the entire dimension just by changing the one max value. This greatly simplified calibration.

We also realized that we had been trying to write on the plotter without the pen being properly engaged. We realized this when the y mechanism almost broke violently. After playing around a bit and looking at another plotter for reference, we realized the y mechanism needed additional simple repairs, and that we hadn’t pressed the “pen” button to get into the writing position. Now that writing was a little more consistent, we also changed some of the ratios around to make the digits the plotter wrote smaller so they would better fit in the boxes.

The camera also gave us some trouble, we noticed the image it was returning was rather blurry. We thought this might be because the lens was dirty, but in actuality, this was because the lens was out of focus. Once refocused, our reading worked passably again.

However, reading really reared its head when we tried to run the system without the desktop connected. Something in how the python CV or camera libraries worked broke the system without a display connected. And, since we had no display to figure out why this error was occurring (i.e. see the error in a terminal window), we were pretty stuck as to how to debug it.


Results

Reading

Text

Solving

Our solver performed quite well; it was by far the quickest component of out system. We decided to go with the parallel solver four our final design. While the parallel solver was slower for most human-solvable puzzles with many numbers already filled, this was a difference between milliseconds and tens of milliseconds. While this is a large difference in terms of order of magnitude, the parallel solver was still far quicker than writing even one stroke of a digit.

In addition, the parallel solver had a much lower worst-case runtime. For a puzzle constructed to challenge the backtracking algorithm, the parallel solver completed it in roughly 2 minutes on the RPi, while the serial version took more than four. We did not see the theoretical 4x speedup, likely because of large overhead time and memory overhead of copying the entire board to each thread. Because this was designed to be a semi-real-time system, we chose the parallel solver because of its better worst case runtime.

Writing

Generic placeholder image

Once we had spent several days optimizing the plotter, we had a fairly legible system for writing to the card. It was still rather slow, but it could write readable digits and was even tolerant to some variation in the plotter’s scaling. We also optimized the actual writing of digits such that pen strokes which the plotter had trouble drawing were gone over in the reverse direction. The various iterations of the plotter’s digit writing can be seen below in the appendix.

Integration

Generic placeholder image

With all of our elements assembled, we had a working end-to-end system which could read a sudoku card, solve it, and write the solution on that same card. The system was not without its faults, but the inclusion of LEDs let the user know when there were errors which needed to be corrected. The system would start automatically on the Pi when powered, indicate when it was ready, and, at the press of a button, read, solve, and write to a sudoku card. We also included a button to run the calibration code discussed previously and an emergency stop button if the user wanted to stop in the middle of writing.


Conclusions

We achieved the goal we set out to: to create an end-to-end system which could read a physical sudoku card, solve it, and write the solution on that same card. To be wholly honest, I was surprised at how well it did work once we had everything assembled. Around when we were trying to get the plotter working, I was fairly worried that we simply would not be able to get the kind of precision we wanted out of the plotter. The software component of this project proved far less challenging than the hardware. Much like with writing, we had significant problems with getting oru CV to work, though not quite to the extent of our woes with the plotter. We left the circuit with the voltage dividers and a slot for the DAC attached to the plotter, so if anyone needs a plotter which can be controlled with SPI, it’s probably still kicking around in the 4760 lab somewhere. It will definitely need to be oiled, though.


Future Work

As we outlined in our project proposal, once we had the infrastructure for reading, writing, and solving, we would liked to have added a more interactive mode wherein the user can ask for hints and the plotter would draw a single digit. Whole resolving the board every time would actually have been an issue, it became clear over the course of development that recognizing handwriting like this would require a lot of additional work for reading, so this would probably require a lot more time.

In addition to the interactive mode, we had other ideas for optimizations. Actually being able to calibrate the plotter without editing the code would have been quite useful. The easiest way would be to include potentiometers which could be read with GPIO to influence the constants used to scale x and y. Potentially, we could also use the Pi camera for this, having a script recognize how far the extreme value of x or y is off from the grid lines, and adjusting the scale using image processing.


Appendix

Parts List

Total: $73.00

References

Ctypes Documentation
MCP4822 Datasheet
Spidev Documentation
RPi.GPIO Reference

Many, many Stack Overflow Questions and Answers


The writing portion of this project would not have been possible without the great effort and gracious assistance of the latte machine group at Bruce Lang’s 4760 lab:

Julia Hesterbrink (jh2232)

Nathan Zimmerberg (nhz2)

Linda Alexander (lma227)

Thank you for offering your knowledge and allowing us to build on your hard work.

Work Distribution

Generic placeholder image

Project group picture

Robert Morgowicz

rjm448@cornell.edu

Solving C code and Python integration
Plotter hacking, circuit design, and soldering
Writing Software and debugging
Plotter Testing and Optimization
Main Function
Solve/Write Integration

Nayanthara Sajan

ns933@cornell.edu

Reading software and debugging
Camera Mount
Read/Solve integration

Drawings

Generic placeholder image

Puzzle Gallery

Generic placeholder image

Code Appendix


  #Robert Morgowicz (rjm448) and Nayanthara Sajan (ns933)
  #Wednesday Lab Section
  #Final Project Main code loop

  #imports

  #Reading
  import time
  import cv2
  import operator
  import numpy as np
  from PIL import Image
  import pytesseract
  from picamera import PiCamera


  #Solving
  from ctypes import *
  #Writing
  import spidev
  #Misc
  import RPi.GPIO as GPIO


  #Init
  #GPIO Setup
  GPIO.setmode(GPIO.BCM)
  #Buttons
  GPIO.setup(5, GPIO.IN)
  GPIO.setup(26, GPIO.IN)
  GPIO.setup(16, GPIO.IN)
  #LEDs
  GPIO.setup(6, GPIO.OUT)
  GPIO.setup(13, GPIO.OUT)
  GPIO.setup(19, GPIO.OUT)
  #SPI init
  bus = 0
  device = 0
  spi = spidev.SpiDev()
  spi.open(bus, device)
  # Set SPI speed and mode
  spi.max_speed_hz = 500000
  spi.mode = 0
  # Camera
  camera = PiCamera()
  #-----------------------------------------------------------------------
  #READING CODE: Functions and globals
  #-----------------------------------------------------------------------
  def show_image(img):
  	"""Shows an image until any key is pressed"""
  	cv2.imshow('image', img)  # Display the image
  	cv2.waitKey(0)
  	cv2.destroyAllWindows()  # Close all windows


  def show_digits(digits, colour=255):
  	"""Shows list of 81 extracted digits in a grid format"""
  	rows = []
  	with_border = [cv2.copyMakeBorder(img.copy(), 1, 1, 1, 1, cv2.BORDER_CONSTANT, None, colour) for img in digits]
  	for i in range(9):
  		row = np.concatenate(with_border[i * 9:((i + 1) * 9)], axis=1)
  		rows.append(row)
  	show_image(np.concatenate(rows))

  def pre_process_image(img, skip_dilate=False):
  	"""Smoothing filter"""
  	proc = cv2.GaussianBlur(img.copy(), (9, 9), 0)

  	# Adaptive threshold using 11 nearest neighbour pixels
  	proc = cv2.adaptiveThreshold(proc, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)

  	# Invert colours, so gridlines have non-zero pixel values.
  	# Necessary to dilate the image, otherwise will look like erosion instead.
  	proc = cv2.bitwise_not(proc, proc)

  	if not skip_dilate:
  		# Dilate the image to increase the size of the grid lines.
  		kernel = np.array([[0., 1., 0.], [1., 1., 1.], [0., 1., 0.]])
  		kernel=kernel.astype(np.uint8)
  		proc = cv2.dilate(proc, kernel)

  	return proc


  def find_corners(img):
  	"""Finds the 4 extreme corners of the largest contour in the image."""
  	contours, h = cv2.findContours(img.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)  # Find contours
  	contours = sorted(contours, key=cv2.contourArea, reverse=True)  # Sort by area, descending
  	polygon = contours[0]  # Largest image

  	# Use of `operator.itemgetter` with `max` and `min` allows us to get the index of the point
  	# Each point is an array of 1 coordinate, hence the [0] getter, then [0] or [1] used to get x and y respectively.

  	# Bottom-right point has the largest (x + y) value
  	# Top-left has point smallest (x + y) value
  	# Bottom-left point has smallest (x - y) value
  	# Top-right point has largest (x - y) value
  	bottom_right, _ = max(enumerate([pt[0][0] + pt[0][1] for pt in polygon]), key=operator.itemgetter(1))
  	top_left, _ = min(enumerate([pt[0][0] + pt[0][1] for pt in polygon]), key=operator.itemgetter(1))
  	bottom_left, _ = min(enumerate([pt[0][0] - pt[0][1] for pt in polygon]), key=operator.itemgetter(1))
  	top_right, _ = max(enumerate([pt[0][0] - pt[0][1] for pt in polygon]), key=operator.itemgetter(1))

  	# Return an array of all 4 points using the indices
  	# Each point is in its own array of one coordinate
  	return [polygon[top_left][0], polygon[top_right][0], polygon[bottom_right][0], polygon[bottom_left][0]]


  def distance_between(p1, p2):
  	"""Returns the scalar distance between two points"""
  	a = p2[0] - p1[0]
  	b = p2[1] - p1[1]
  	return np.sqrt((a ** 2) + (b ** 2))


  def crop_and_warp(img, crop_rect):
  	"""Crops and warps a rectangular section from an image into a square of similar size."""

  	# Rectangle described by top left, top right, bottom right and bottom left points
  	top_left, top_right, bottom_right, bottom_left = crop_rect[0], crop_rect[1], crop_rect[2], crop_rect[3]

  	# Explicitly set the data type to float32 or `getPerspectiveTransform` will throw an error
  	src = np.array([top_left, top_right, bottom_right, bottom_left], dtype='float32')

  	# Get the longest side in the rectangle
  	side = max([
  		distance_between(bottom_right, top_right),
  		distance_between(top_left, bottom_left),
  		distance_between(bottom_right, bottom_left),
  		distance_between(top_left, top_right)
  	])

  	# Describe a square with side of the calculated length, this is the new perspective we want to warp to
  	dst = np.array([[0, 0], [side - 1, 0], [side - 1, side - 1], [0, side - 1]], dtype='float32')

  	# Gets the transformation matrix for skewing the image to fit a square by comparing the 4 before and after points
  	m = cv2.getPerspectiveTransform(src, dst)

  	# Performs the transformation on the original image
  	return cv2.warpPerspective(img, m, (int(side), int(side)))


  def infer_grid(img):
  	"""Infers 81 cell grid from a square image."""
  	squares = []
  	side = img.shape[:1]
  	side = side[0] / 9

  	# Note that we swap j and i here so the rectangles are stored in the list reading left-right instead of top-down.
  	for j in range(9):
  		for i in range(9):
  			p1 = (i * side, j * side)  # Top left corner of a bounding box
  			p2 = ((i + 1) * side, (j + 1) * side)  # Bottom right corner of bounding box
  			squares.append((p1, p2))
  	return squares


  def cut_from_rect(img, rect):
  	"""Cuts a rectangle from an image using the top left and bottom right points."""
  	return img[int(rect[0][1]):int(rect[1][1]), int(rect[0][0]):int(rect[1][0])]


  def scale_and_centre(img, size, margin=0, background=0):
  	"""Scales and centres an image onto a new background square."""
  	h, w = img.shape[:2]

  	def centre_pad(length):
  		"""Handles centering for a given length that may be odd or even."""
  		if length % 2 == 0:
  			side1 = int((size - length) / 2)
  			side2 = side1
  		else:
  			side1 = int((size - length) / 2)
  			side2 = side1 + 1
  		return side1, side2

  	def scale(r, x):
  		return int(r * x)

  	if h > w:
  		t_pad = int(margin / 2)
  		b_pad = t_pad
  		ratio = (size - margin) / h
  		w, h = scale(ratio, w), scale(ratio, h)
  		l_pad, r_pad = centre_pad(w)
  	else:
  		l_pad = int(margin / 2)
  		r_pad = l_pad
  		ratio = (size - margin) / w
  		w, h = scale(ratio, w), scale(ratio, h)
  		t_pad, b_pad = centre_pad(h)

  	img = cv2.resize(img, (w, h))
  	img = cv2.copyMakeBorder(img, t_pad, b_pad, l_pad, r_pad, cv2.BORDER_CONSTANT, None, background)
  	return cv2.resize(img, (size, size))


  def find_largest_feature(inp_img, scan_tl=None, scan_br=None):
  	"""
  	Uses the fact the `floodFill` function returns a bounding box of the area it filled to find the biggest
  	connected pixel structure in the image. Fills this structure in white, reducing the rest to black.
  	"""
  	img = inp_img.copy()  # Copy the image, leaving the original untouched
  	height, width = img.shape[:2]

  	max_area = 0
  	seed_point = (None, None)

  	if scan_tl is None:
  		scan_tl = [0, 0]

  	if scan_br is None:
  		scan_br = [width, height]

  	# Loop through the image
  	for x in range(scan_tl[0], scan_br[0]):
  		for y in range(scan_tl[1], scan_br[1]):
  			# Only operate on light or white squares
  			if img.item(y, x) == 255 and x < width and y < height:  # Note that .item() appears to take input as y, x
  				area = cv2.floodFill(img, None, (x, y), 64)
  				if area[0] > max_area:  # Gets the maximum bound area which should be the grid
  					max_area = area[0]
  					seed_point = (x, y)

  	# Colour everything grey (compensates for features outside of our middle scanning range
  	for x in range(width):
  		for y in range(height):
  			if img.item(y, x) == 255 and x < width and y < height:
  				cv2.floodFill(img, None, (x, y), 64)

  	mask = np.zeros((height + 2, width + 2), np.uint8)  # Mask that is 2 pixels bigger than the image

  	# Highlight the main feature
  	if all([p is not None for p in seed_point]):
  		cv2.floodFill(img, mask, seed_point, 255)

  	top, bottom, left, right = height, 0, width, 0

  	for x in range(width):
  		for y in range(height):
  			if img.item(y, x) == 64:  # Hide anything that isn't the main feature
  				cv2.floodFill(img, mask, (x, y), 0)

  			# Find the bounding parameters
  			if img.item(y, x) == 255:
  				top = y if y < top else top
  				bottom = y if y > bottom else bottom
  				left = x if x < left else left
  				right = x if x > right else right

  	bbox = [[left, top], [right, bottom]]
  	return img, np.array(bbox, dtype='float32'), seed_point


  def extract_digit(img, rect, size):
  	"""Extracts a digit (if one exists) from a Sudoku square."""

  	digit = cut_from_rect(img, rect)  # Get the digit box from the whole square

  	# Use fill feature finding to get the largest feature in middle of the box
  	# Margin used to define an area in the middle we would expect to find a pixel belonging to the digit
  	h, w = digit.shape[:2]
  	margin = int(np.mean([h, w]) / 2.5)
  	_, bbox, seed = find_largest_feature(digit, [margin, margin], [w - margin, h - margin])
  	digit = cut_from_rect(digit, bbox)

  	# Scale and pad the digit so that it fits a square of the digit size we're using for machine learning
  	w = bbox[1][0] - bbox[0][0]
  	h = bbox[1][1] - bbox[0][1]

  	# Ignore any small bounding boxes
  	if w > 0 and h > 0 and (w * h) > 100 and len(digit) > 0:
  		return scale_and_centre(digit, size, 4)
  	else:
  		return np.zeros((size, size), np.uint8)


  def get_digits(img, squares, size):
  	"""Extracts digits from their cells and builds an array"""
  	digits = []
  	img = pre_process_image(img.copy(), skip_dilate=True)
  	for square in squares:
  		digits.append(extract_digit(img, square, size))
  	return digits


  def parse_grid(path):
  	original = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
  	processed = pre_process_image(original)
  	corners = find_corners(processed)
  	cropped = crop_and_warp(original, corners)
  	squares = infer_grid(cropped)
  	digit_imgs = get_digits(cropped, squares, 28)
  	#show_digits(digit_imgs)
  	digits = []
  	for img in digit_imgs:
  		GPIO.output(6, GPIO.HIGH)
  		text = pytesseract.image_to_string(cv2.bitwise_not(img), config="--psm 10 ")
  		print(text)
  		GPIO.output(6, GPIO.LOW)
  		valid = True
  		if (text.isdigit()):
  			if not (int(text) in range(1,10)):
  				valid = False
  		elif (text):
  			valid = False
  		else:
  			text = 0

  		if(valid):
  			digits.append(int(text))
  		else:
  			print("Could not recognize board. Please try again. ")
  			return []
  	return digits

  #-----------------------------------------------------------------------
  #SOLVING CODE: Functions and globals
  #-----------------------------------------------------------------------
  #import clib and set return type
  so_file = "/home/pi/Project/solver.so"
  solver = CDLL(so_file)
  solve = solver.solve
  solve.restype = POINTER(c_int * 81)
  #-----------------------------------------------------------------------
  #WRITING CODE: Functions and globals
  #-----------------------------------------------------------------------
  #constants
  min_val = 0b000000000000
  max_val = 0b111111111111
  max_x = 4045
  max_y = 3925
  a_mask = 0b00110000
  b_mask = 0b10110000

  #global variables
  prev_x = 0b000000000000
  prev_y = 0b000000000000
  curr_x = 0b000000000000
  curr_y = 0b000000000000

  #SPI writing functions
  #NOTE: Y is the ROW direction for the array, X is the COLUMN direction
  def write_x(val):
  	msg = [ a_mask | (val >> 8), val & 0xff]
  	spi.xfer2(msg)

  def write_y(val):
  	msg = [ b_mask | (val >> 8), val & 0xff]
  	spi.xfer2(msg)

  def seek_y(y):
  	global curr_y
  	if (y > 8):
  		curr_y = 0b111111111111
  	elif (y < 0):
  		curr_y = 0
  	else:
  		curr_y = (int(max_y/9))*y
  	write_y(curr_y)

  def seek_x(x):
  	global curr_x
  	if (x > 8):
  		curr_x = 0b111111111111
  	elif (x < 0):
  		curr_x = 0
  	else:
  		curr_x = (int(max_x/9))*x
  	write_x(curr_x)

  def de_grid():
  	global curr_x
  	global curr_y
  	curr_x = curr_x + int(max_x/6/9)
  	curr_y = curr_y + int(max_y/4/9)
  	write_x(curr_x)
  	write_y(curr_y)
  	time.sleep(.25)

  def cross_right_y():
  	global curr_y
  	curr_y = curr_y + int(2*max_y/4/9)
  	write_y(curr_y)
  	time.sleep(.5)

  def cross_left_y():
  	global curr_y
  	curr_y = curr_y - int(2*max_y/4/9)
  	write_y(curr_y)
  	time.sleep(.5)


  def half_box_down_x():
  	global curr_x
  	curr_x = curr_x + int(2*max_x/6/9)
  	write_x(curr_x)
  	time.sleep(.5)

  def half_box_up_x():
  	global curr_x
  	curr_x = curr_x - int(2*max_x/6/9)
  	write_x(curr_x)
  	time.sleep(.5)


  def full_box_down_x():
  	global curr_x
  	curr_x = curr_x + int(4*max_x/6/9)
  	write_x(curr_x)
  	time.sleep(.5)

  def full_box_up_x():
  	global curr_x
  	curr_x = curr_x - int(4*max_x/6/9)
  	write_x(curr_x)
  	time.sleep(.5)

  def reset_x():
  	global curr_x
  	write_x(min_val)
  	time.sleep(1)
  	write_x(curr_x)
  	time.sleep(1)

  def re_grid():
  	global curr_x
  	global curr_y
  	curr_x = curr_x - int(max_x/6/9)
  	curr_y = curr_y - int(max_y/4/9)
  	if curr_x < 0:
  		curr_x = 0
  	if curr_y < 0:
  		curr_y = 0
  	write_x(curr_x)
  	write_y(curr_y)
  	time.sleep(.25)

  def draw(n):
  	global curr_y
  	de_grid()
  	if (n == 1):
  		curr_y = curr_y + 140
  		write_y(curr_y)
  		time.sleep(.5)
  		half_box_down_x()
  		half_box_down_x()
  		half_box_up_x()
  		half_box_up_x()
  		curr_y = curr_y - 140
  		write_y(curr_y)
  		time.sleep(.5)
  	elif (n==2):
  		cross_right_y()
  		half_box_down_x()
  		cross_left_y()
  		half_box_down_x()
  		cross_right_y()
  		cross_left_y()
  		half_box_up_x()
  		cross_right_y()
  		half_box_up_x()
  		cross_left_y()
  	elif (n==3):
  		cross_right_y()
  		half_box_down_x()
  		cross_left_y()
  		cross_right_y()
  		half_box_down_x()
  		cross_left_y()
  		cross_right_y()
  		full_box_up_x()
  		cross_left_y()
  	elif (n==4):
  		half_box_down_x()
  		cross_right_y()
  		half_box_up_x()
  		half_box_down_x()
  		half_box_down_x()
  		half_box_up_x()
  		cross_left_y()
  		half_box_up_x()
  	elif (n==5):
  		cross_right_y()
  		cross_left_y()
  		half_box_down_x()
  		cross_right_y()
  		half_box_down_x()
  		cross_left_y()
  		cross_right_y()
  		half_box_up_x()
  		cross_left_y()
  		half_box_up_x()
  	elif (n==6):
  		cross_right_y()
  		cross_left_y()
  		half_box_down_x()
  		cross_right_y()
  		half_box_down_x()
  		cross_left_y()
  		half_box_up_x()
  		half_box_down_x()
  		full_box_up_x()
  	elif (n==7):
  		cross_right_y()
  		half_box_down_x()
  		cross_left_y()
  		cross_right_y()
  		half_box_down_x()
  		full_box_up_x()
  		cross_left_y()
  	elif (n==8):
  		cross_right_y()
  		half_box_down_x()
  		cross_left_y()
  		cross_right_y()
  		half_box_down_x()
  		cross_left_y()
  		half_box_up_x()
  		half_box_down_x()
  		half_box_up_x()
  		half_box_up_x()
  		half_box_down_x()
  		half_box_up_x()
  	elif (n==9):
  		cross_right_y()
  		half_box_down_x()
  		half_box_down_x()
  		cross_left_y()
  		cross_right_y()
  		half_box_up_x()
  		cross_left_y()
  		half_box_up_x()
  		half_box_down_x()
  		half_box_up_x()
  	else:
  		print("cannot write digit")
  	re_grid()

  #ISR for Calibration
  def calibrate_func(channel):
  	#Turn on LEDS
  	global calibrating
  	calibrating = True
  	#wait for digit to finish writing
  	time.sleep(2)
  	GPIO.output(6, GPIO.HIGH)
  	GPIO.output(19, GPIO.HIGH)
  	#Go To Extreemes
  	write_x(min_val)
  	print(0)
  	time.sleep(1)
  	write_x(max_x)
  	print(max_val)
  	time.sleep(1)
  	write_x(min_val)
  	print(0)
  	time.sleep(1)

  	write_y(min_val)
  	print(0)
  	time.sleep(1)
  	write_y(max_y)
  	print(max_val)
  	time.sleep(1)
  	write_y(min_val)
  	print(min_val)
  	time.sleep(1)
  	draw(9)
  	time.sleep(1)
  	#Increment up
  	i = 1
  	while i < 9:
  		print(i)
  		seek_x(i)
  		reset_x()
  		time.sleep(1)
  		seek_y(i)
  		time.sleep(1)
  		draw(i)
  		time.sleep(.5)
  		i = i + 1
  	#Increment down
  	i = 8
  	while i > -1:
  		seek_x(i)
  		reset_x()
  		print(i)
  		time.sleep(1)
  		seek_y(i)
  		time.sleep(1)
  		i = i - 1
  	write_y(min_val)
  	write_x(min_val)
  	time.sleep(1)
  	#Turn off LEDS
  	GPIO.output(6, GPIO.LOW)
  	GPIO.output(19, GPIO.LOW)
  	calibrating = False
  #ISR for quitting mid-solve
  def redo_func(channel):
  	global redo
  	redo = True
  #create event for calibrate
  GPIO.add_event_detect(5, GPIO.FALLING, callback=calibrate_func,bouncetime = 300)
  GPIO.add_event_detect(16, GPIO.FALLING, callback=redo_func,bouncetime = 300)
  calibrating = False
  run_forever = True
  redo = False
  #main loop
  while(run_forever):
  	#poll for start button
  	GPIO.output(13, GPIO.HIGH)
  	redo = False
  	if not GPIO.input(26):
  		GPIO.output(6, GPIO.LOW)
  		GPIO.output(13, GPIO.LOW)
  		GPIO.output(19, GPIO.LOW)
  		#---------------------------------------------------------------
  		#READ
  		#---------------------------------------------------------------
  		GPIO.output(6, GPIO.HIGH)
  		#camera.start_preview()
  		time.sleep(10)
  		camera.capture('img.jpg')
  		#camera.stop_preview()
  		board = parse_grid('img.jpg')
  		if len(board) == 0:
  			GPIO.output(6, GPIO.HIGH)
  			GPIO.output(13, GPIO.HIGH)
  			GPIO.output(19, GPIO.HIGH)
  			continue
  		print(board)
  		#should end with 1d 81 length list of python ints
  		#board = [0 for i in range(81)]
  		GPIO.output(6, GPIO.LOW)
  		#---------------------------------------------------------------
  		#SOLVE
  		#---------------------------------------------------------------
  		GPIO.output(13, GPIO.HIGH)
  		#create 2d boolean mask for inital array
  		mask = [[False for i in range(9)] for j in range(9)]
  		c = 0
  		while (c < 81):
  			if board[c] == 0:
  				print(c)
  				mask[int(c/9)][c%9] = True
  			c = c + 1
  		print("going to c")
  		#gen c-readable array
  		arr = (c_int * 81)(*board)
  		#solve and save returned array
  		arr2 = (solver.solve(arr))
  		#create 2d python array out of solved array
  		write_board = [[0 for i in range(9)] for j in range(9)]
  		j = 0
  		for i in arr2.contents:
  			write_board[int(j/9)][j%9] = i
  			j = j + 1
  		print(mask)
  		print(write_board)
  		GPIO.output(13, GPIO.LOW)
  		#---------------------------------------------------------------
  		#WRITE
  		#---------------------------------------------------------------
  		GPIO.output(19, GPIO.HIGH)
  		#reset x and y and prime pen
  		#Go To Extreemes
  		write_x(min_val)
  		print(0)
  		time.sleep(1)
  		write_x(max_x)
  		print(max_val)
  		time.sleep(1)
  		write_x(min_val)
  		print(0)
  		time.sleep(1)

  		write_y(min_val)
  		print(0)
  		time.sleep(1)
  		write_y(max_y)
  		print(max_val)
  		time.sleep(1)
  		write_y(min_val)
  		print(min_val)
  		time.sleep(1)
  		#for each row
  		a = 0
  		b = 0
  		while (a < 9):
  			#even rows go left -> right
  			if a%2 == 0:
  				b = 0
  				while(b < 9):
  					while calibrating:
  						pass
  					if redo:
  						continue
  					if mask[a][b]:
  						print("writing "+str(write_board[a][b])+" to "+str(a)+", "+str(b))
  						seek_x(a)
  						time.sleep(.5)
  						seek_y(b)
  						time.sleep(.5)
  						draw(write_board[a][b])
  					b = b + 1
  			#odd rows go right -> left
  			else:
  				b = 8
  				while(b > -1):
  					while calibrating:
  						pass
  					if redo:
  						continue
  					if mask[a][b]:
  						print("writing "+str(write_board[a][b])+" to "+str(a)+", "+str(b))
  						seek_x(a)
  						time.sleep(.5)
  						seek_y(b)
  						time.sleep(.5)
  						draw(write_board[a][b])
  					b = b - 1
  			#reset x after each row
  			#reset_x()
  			a = a + 1
  		seek_x(0)
  		time.sleep(1)
  		seek_y(0)
  		time.sleep(1)
  		GPIO.output(19, GPIO.LOW)
  		#END OF BUTTON CODE

  	time.sleep(.1)
  	GPIO.output(13, GPIO.HIGH)
  	time.sleep(.1)

  GPIO.cleanup()
  spi.close()


                #include 
                #include 
                #include 
                #include 
                #include 
                #include 
                #define BILLION 1000000000L

                #define Q_SIZE 81

                bool f3 = false;
                pthread_mutex_t q_mutex = PTHREAD_MUTEX_INITIALIZER;
                int p_found = -1;

                struct job
                {
                	int v1;
                	int v2;
                };

                struct queue
                {
                	struct job **buffer;
                	int start;
                	int end;
                	int c;
                	int all_produced;
                };

                typedef struct
                {
                	int pid, x1, x2, y1, y2;
                	int **board;
                	struct queue *q;
                } GM;

                struct job *get_work(struct queue *q)
                {
                	pthread_mutex_lock(&q_mutex);
                	if (!q->c)
                	{
                		pthread_mutex_unlock(&q_mutex);
                		return NULL;
                	}
                	struct job *temp = q->buffer[q->start];
                	q->start = (q->start + 1) % Q_SIZE;
                	q->c--;
                	pthread_mutex_unlock(&q_mutex);
                	return temp;
                }

                void add_work(struct queue *q, struct job *toAdd)
                {
                	while (q->c == Q_SIZE)
                		;
                	pthread_mutex_lock(&q_mutex);
                	q->end = (q->end + 1) % Q_SIZE;
                	q->buffer[q->end] = toAdd;
                	q->c++;
                	pthread_mutex_unlock(&q_mutex);
                }

                //helper function, verifies the board still complies with the invariant
                bool verify_placement(int i, int j, int n, int **board)
                {
                	//no same number in the row
                	for (int x = 0; x < 9; x++)
                	{
                		if (x != i && board[x][j] == n)
                		{
                			return false;
                		}
                	}
                	//no same number in the column
                	for (int y = 0; y < 9; y++)
                	{
                		if (y != j && board[i][y] == n)
                		{
                			return false;
                		}
                	}
                	//no same number in the square
                	if (i < 3)
                	{
                		if (j < 3)
                		{
                			//top left
                			for (int x = 0; x < 3; x++)
                			{
                				for (int y = 0; y < 3; y++)
                				{
                					if (x != i && y != j && board[x][y] == n)
                					{
                						return false;
                					}
                				}
                			}
                		}
                		//top middle
                		else if (j < 6)
                		{
                			for (int x = 0; x < 3; x++)
                			{
                				for (int y = 3; y < 6; y++)
                				{
                					if (x != i && y != j && board[x][y] == n)
                					{
                						return false;
                					}
                				}
                			}
                		}
                		//top right
                		else
                		{
                			for (int x = 0; x < 3; x++)
                			{
                				for (int y = 6; y < 9; y++)
                				{
                					if (x != i && y != j && board[x][y] == n)
                					{
                						return false;
                					}
                				}
                			}
                		}
                	}
                	else if (i < 6)
                	{
                		if (j < 3)
                		{
                			//middle left
                			for (int x = 3; x < 6; x++)
                			{
                				for (int y = 0; y < 3; y++)
                				{
                					if (x != i && y != j && board[x][y] == n)
                					{
                						return false;
                					}
                				}
                			}
                		}
                		//true middle
                		else if (j < 6)
                		{
                			for (int x = 3; x < 6; x++)
                			{
                				for (int y = 3; y < 6; y++)
                				{
                					if (x != i && y != j && board[x][y] == n)
                					{
                						return false;
                					}
                				}
                			}
                		}
                		//middle right
                		else
                		{
                			for (int x = 3; x < 6; x++)
                			{
                				for (int y = 6; y < 9; y++)
                				{
                					if (x != i && y != j && board[x][y] == n)
                					{
                						return false;
                					}
                				}
                			}
                		}
                	}
                	else
                	{
                		if (j < 3)
                		{
                			//bottom left
                			for (int x = 6; x < 9; x++)
                			{
                				for (int y = 0; y < 3; y++)
                				{
                					if (x != i && y != j && board[x][y] == n)
                					{
                						return false;
                					}
                				}
                			}
                		}
                		//bottom middle
                		else if (j < 6)
                		{
                			for (int x = 6; x < 9; x++)
                			{
                				for (int y = 3; y < 6; y++)
                				{
                					if (x != i && y != j && board[x][y] == n)
                					{
                						return false;
                					}
                				}
                			}
                		}
                		//bottom right
                		else
                		{
                			for (int x = 6; x < 6; x++)
                			{
                				for (int y = 6; y < 9; y++)
                				{
                					if (x != i && y != j && board[x][y] == n)
                					{
                						return false;
                					}
                				}
                			}
                		}
                	}
                	return true;
                }

                //maps indecies of a 2d 9x9 array to a 1d 81 array in row-major order
                int arrayIndexMap(int row, int col)
                {
                	return row * 9 + col;
                }

                //function to print the solved board to the console
                void printSolution(int **board)
                {
                	printf("============SOLUTION============\n");
                	for (int i = 0; i < 9; i++)
                	{
                		if (i == 3 || i == 6)
                		{
                			printf(" ---------+-----------+--------- \n");
                		}
                		for (int j = 0; j < 9; j++)
                		{
                			if (j == 3 || j == 6)
                			{
                				printf(" | ");
                			}
                			printf(" %d ", board[i][j]);
                		}
                		printf("\n");
                	}
                }

                void copy_board(int **b1, int **b2)
                {
                	for (int i = 0; i < 9; i++)
                	{
                		for (int j = 0; j < 9; j++)
                		{
                			b2[i][j] = b1[i][j];
                		}
                	}
                }

                //function to generate boards
                void generate_board(int **board)
                {
                	//for now, hard-coded
                	/* Start:
                		 * 2 . 7 | 9 . . | . . .
                		 * 3 4 . | 5 7 2 | 1 . .
                		 * 5 . . | . . 1 | 2 8 .
                		 * ------+-------+------
                		 * 7 . 4 | 6 . . | . . 3
                		 * . . . | 7 5 . | . 2 8
                		 * 6 . . | 1 9 . | 4 . .
                		 * ------+-------+------
                		 * . 2 . | . 4 9 | 7 . 1
                		 * . 9 5 | 8 . . | 3 4 2
                		 * . 7 3 | . 1 5 | 8 . .
                		 *
                		 * Solution:
                		 * 2 1 7 | 9 8 6 | 5 3 4
                		 * 3 4 8 | 5 7 2 | 1 9 6
                		 * 5 6 9 | 4 3 1 | 2 8 7
                		 * ------+-------+------
                		 * 7 5 4 | 6 2 8 | 9 1 3
                		 * 9 3 1 | 7 5 4 | 6 2 8
                		 * 6 8 2 | 1 9 3 | 4 7 5
                		 * ------+-------+------
                		 * 8 2 6 | 3 4 9 | 7 5 1
                		 * 1 9 5 | 8 6 7 | 3 4 2
                		 * 4 7 3 | 2 1 5 | 8 6 9
                		 */
                	board[0][0] = 2;
                	board[0][2] = 7;
                	board[0][3] = 9;
                	board[1][0] = 3;
                	board[1][1] = 4;
                	board[1][3] = 5;
                	board[1][4] = 7;
                	board[1][5] = 2;
                	board[1][6] = 1;
                	board[2][0] = 5;
                	board[2][5] = 1;
                	board[2][6] = 2;
                	board[2][7] = 8;
                	board[3][0] = 7;
                	board[3][2] = 4;
                	board[3][3] = 6;
                	board[3][8] = 3;
                	board[4][3] = 7;
                	board[4][4] = 5;
                	board[4][7] = 2;
                	board[4][8] = 8;
                	board[5][0] = 6;
                	board[5][3] = 1;
                	board[5][4] = 9;
                	board[5][6] = 4;
                	board[6][1] = 2;
                	board[6][4] = 4;
                	board[6][5] = 9;
                	board[6][6] = 7;
                	board[6][8] = 1;
                	board[7][1] = 9;
                	board[7][2] = 5;
                	board[7][3] = 8;
                	board[7][6] = 3;
                	board[7][7] = 4;
                	board[7][8] = 2;
                	board[8][1] = 7;
                	board[8][2] = 3;
                	board[8][4] = 1;
                	board[8][5] = 5;
                	board[8][6] = 8;
                }

                void generate_board_hard(int **board)
                {
                	//for now, hard-coded
                	/* Start:
                		 * . . . | . . . | . . .
                		 * . . . | . . 3 | . 8 5
                		 * . . 1 | . 2 . | . . .
                		 * ------+-------+------
                		 * . . . | 5 . 7 | . . .
                		 * . . 4 | . . . | 1 . .
                		 * . 9 . | . . . | . . .
                		 * ------+-------+------
                		 * 5 . . | . . . | . 7 3
                		 * . . 2 | . 1 . | . . .
                		 * . . . | . 4 . | . . 9
                		 *
                		 * ============SOLUTION============
                		 * 9  6  3  |  8  5  4  |  7  2  1
                		 * 4  2  7  |  9  1  3  |  6  8  5
                		 * 8  5  1  |  7  2  6  |  3  9  4
                		 * ---------+-----------+---------
                		 * 1  8  6  |  5  3  7  |  9  4  2
                		 * 7  3  4  |  2  9  8  |  1  5  6
                		 * 2  9  5  |  4  6  1  |  8  3  7
                		 * ---------+-----------+---------
                		 * 5  1  9  |  6  8  2  |  4  7  3
                		 * 3  4  2  |  1  7  9  |  5  6  8
                		 * 6  7  8  |  3  4  5  |  2  1  9
                		 */
                	board[1][5] = 3;
                	board[1][7] = 8;
                	board[1][8] = 5;

                	board[2][2] = 1;
                	board[2][4] = 2;

                	board[3][3] = 5;
                	board[3][5] = 7;

                	board[4][2] = 4;
                	board[4][6] = 1;

                	board[5][1] = 9;

                	board[6][0] = 5;
                	board[6][7] = 7;
                	board[6][8] = 3;

                	board[7][2] = 2;
                	board[7][3] = 1;

                	board[8][4] = 4;
                	board[8][8] = 9;
                }

                //recursevly solve the board
                int solve_board(int **board, int i, int j)
                {
                	//base case
                	if (i == 9)
                	{
                		//end of board, done
                		return 1;
                	}
                	if (f3)
                	{
                		//someone else is done
                		return -1;
                	}
                	int d = board[i][j];
                	//if the digit is prefilled
                	if (d != 0)
                	{
                		//printf("preallocated digit\n");
                		//check invariant, if it's been violated return -1 immediatly; something else needs to change

                		if (!verify_placement(i, j, d, board))
                		{
                			return -1;
                		}
                		//otherwise recurse
                		else
                		{
                			//if end of row
                			if (j == 8)
                			{

                				//printf("next row\n");
                				//still rows to go through
                				return solve_board(board, i + 1, 0);
                			}
                			else
                			{
                				//printf("next col\n");
                				//next column over
                				return solve_board(board, i, j + 1);
                			}
                		}
                	}
                	else
                	{
                		//if not, while you still have digits to try
                		int k = 1;
                		int m = 0;
                		while (k < 10 && m < 1)
                		{
                			//try one, if it works, recurse
                			if (verify_placement(i, j, k, board))
                			{
                				//apply working digit
                				board[i][j] = k;
                				//printf("board[%d][%d]<-%d\n", i, j, board[i][j]);
                				//recurse
                				//if end of column
                				if (j == 8)
                				{
                					//printf("next row\n");
                					//go to next row
                					m = solve_board(board, i + 1, 0);
                				}
                				else
                				{
                					//printf("next col\n");
                					//still cols to go through
                					m = solve_board(board, i, j + 1);
                				}
                			}
                			//increment k and try again
                			k++;
                		}
                		//bubble up the solution if it's been found
                		if (m == 1)
                		{
                			return 1;
                		}
                		/*you've tried all the digits
                		return your digit to zero and try again, no solution, return -1 */
                		board[i][j] = 0;
                		return -1;
                	}
                }

                //thread function, recieve jobs and try solving
                void *do_solve(void *varg)
                {
                	GM *arg = varg;
                	int pid, v1, v2, x1, x2, y1, y2;
                	int **start_board;
                	struct queue *q;

                	pid = arg->pid;
                	start_board = arg->board;
                	q = arg->q;
                	x1 = arg->x1;
                	x2 = arg->x2;
                	y1 = arg->y1;
                	y2 = arg->y2;

                	struct timespec start, end;
                	double time;
                	clock_gettime(CLOCK_MONOTONIC, &start);
                	int jobs_processed = 0;

                	//make your own board
                	int **local_board = (int **)malloc(9 * sizeof(int *));
                	for (int a = 0; a < 9; a++)
                	{
                		local_board[a] = (int *)malloc(9 * sizeof(int));
                	}

                	//p0 spawns the jobs to be placed
                	if (pid == 0)
                	{
                		for (int i = 1; i < 10; i++)
                		{
                			for (int k = 1; k < 10; k++)
                			{

                				struct job *job = malloc(sizeof(struct job));
                				job->v1 = i;
                				job->v2 = k;
                				add_work(q, job);
                				//printf("Produced: %d, %d\n", i, j);
                			}
                		}
                		//printf("Left in queue: %d\n", q->c);
                		q->all_produced = 1;
                	}
                	//all threads take jobs and evaluate
                	//printf("Starting Jobs\n");
                	copy_board(start_board, local_board);
                	int found = -1;
                	while (!f3 && (!q->all_produced || q->c))
                	{
                		//new job, reset your board
                		//printf("Copying Board\n");

                		//printf("Consumer %d reading", pid);
                		struct job *j;
                		j = get_work(q);
                		if (j == NULL)
                		{
                			continue;
                		}
                		v1 = j->v1;
                		v2 = j->v2;
                		//printf("Atempting first two placement\n");
                		//try the first two, if they violate the invariant, get a new job
                		if (verify_placement(x1, y1, v1, local_board))
                		{
                			local_board[x1][y1] = v1;
                			if (verify_placement(x2, y2, v2, local_board))
                			{
                				local_board[x2][y2] = v2;
                				//if they're fine, solve the rest of the board
                				//printf("Continuing Solver\n");
                				found = solve_board(local_board, x2, y2);
                				if (found > 0)
                				{
                					f3 = true;
                					p_found = pid;
                					break;
                				}
                				else
                				{
                					//if the first was fine, but second wasn't, reset first
                					local_board[x1][y1] = 0;
                				}
                			}
                		}
                		jobs_processed++;
                	}
                	printf("Thread %d got %d jobs\n", pid, jobs_processed);
                	clock_gettime(CLOCK_MONOTONIC, &end);
                	time = BILLION * (end.tv_sec - start.tv_sec) + (end.tv_nsec - start.tv_nsec);
                	time = time / BILLION;
                	printf("Thread %d elapsed: %lf seconds\n", pid, time);

                	pthread_exit((void *)local_board);
                }

                int ** oned_to_twod(int * board) {
                	int ** rhet = (int **)malloc(9 * sizeof(int *));
                	for (int a = 0; a < 9; a++)
                	{
                		rhet[a] = (int *)malloc(9 * sizeof(int));
                	}
                	for (int i = 0; i < 81; i++) {
                		rhet[i/9][i%9] = board[i];
                	}
                	return rhet;
                }

                int * twod_to_oned(int ** board) {
                	//printf("reconverting\n");
                	int * rhet = (int *)malloc(81 * sizeof(int));
                	for (int i = 0; i < 81; i++) {
                		 rhet[i] = board[i/9][i%9];
                	}
                	return rhet;
                }

                int* solve(int * board) {
                	//init
                	int a, b, **start_board;
                	int x1, x2, y1, y2;
                	bool f1 = false, f2 = false;
                	start_board = oned_to_twod(board);
                	printSolution(start_board);
                	//find the first two open spots on the board
                	//printf("Finding First Two Slots\n");
                	for (int i = 0; i < 9; i++)
                	{
                		for (int j = 0; j < 9; j++)
                		{
                			//printf("checking %d, %d\n", i, j);
                			if (f1 && f2)
                			{
                				break;
                			}
                			if (start_board[i][j] == 0)
                			{
                				if (f1)
                				{
                					x2 = i;
                					y2 = j;
                					f2 = true;
                				}
                				else
                				{
                					x1 = i;
                					y1 = j;
                					f1 = true;
                				}
                			}
                		}
                	}
                	if (!f1)
                	{
                		printf("Already Solved\n");
                	}
                	else if (!f2)
                	{
                		printf("Only one missing\n");
                		b = solve_board(start_board, x1, y1);
                	}
                	else
                	{
                		//printf("Starting Parallel\n");
                		//make the queue
                		struct queue *q = malloc(sizeof(struct queue));
                		q->buffer = malloc(sizeof(struct job *) * Q_SIZE);
                		q->start = 0;
                		q->end = Q_SIZE - 1;
                		q->c = 0;
                		q->all_produced = 0;

                		pthread_t *threads = malloc(4 * sizeof(threads));

                		//printf("init return structure\n");
                		void **local_boards = malloc(4 * sizeof(int **));
                		for (int i = 0; i < 4; i++)
                		{
                			local_boards[i] = NULL;
                		}
                		//printf("Spawning Threads\n");
                		for (int i = 0; i < 4; i++)
                		{
                			GM *arg = malloc(sizeof(*arg));
                			arg->q = q;
                			arg-> x1 = x1;
                			arg-> x2 = x2;
                			arg-> y1 = y1;
                			arg-> y2 = y2;
                			arg->pid = i;
                			arg->board = start_board;
                			pthread_create(&threads[i], NULL, do_solve, arg);
                		}

                		//printf("Recieving Threads\n");
                		for (int i = 0; i < 4; i++)
                		{
                			pthread_join(threads[i], &local_boards[i]);
                		}
                		start_board = (int **)local_boards[p_found];
                	}
                	//prints
                	if (b == -1)
                	{
                		printf("Exited in error\n");
                	}
                	printSolution(start_board);
                	int * rhet = twod_to_oned(start_board);
                	return rhet;
                }


                void main(int **board)
                {
                	//printf("Starting Main\n");
                	//timing setup
                	struct timespec initstart, initend;
                	struct timespec compstart, compend;
                	double inittime, comptime, totaltime;

                	/*-------------------------------------------------------
                		  Computation
                		 -------------------------------------------------------*/

                	//init
                	int a, b, **start_board;
                	int x1, x2, y1, y2;
                	bool f1 = false, f2 = false;
                	//for now, the board will be hard-coded
                	//printf("Making Start Board\n");
                	start_board = (int **)malloc(9 * sizeof(int *));
                	for (a = 0; a < 9; a++)
                	{
                		start_board[a] = (int *)malloc(9 * sizeof(int));
                	}
                	generate_board(start_board);
                	clock_gettime(CLOCK_MONOTONIC, &initstart);
                	//find the first two open spots on the board
                	//printf("Finding First Two Slots\n");
                	for (int i = 0; i < 9; i++)
                	{
                		for (int j = 0; j < 9; j++)
                		{
                			if (f1 && f2)
                			{
                				break;
                			}
                			if (start_board[i][j] == 0)
                			{
                				if (f1)
                				{
                					x2 = i;
                					y2 = j;
                					f2 = true;
                				}
                				else
                				{
                					x1 = i;
                					y1 = j;
                					f1 = true;
                				}
                			}
                		}
                	}
                	if (!f1)
                	{
                		printf("Already Solved\n");
                	}
                	else if (!f2)
                	{
                		printf("Only one missing\n");
                		b = solve_board(start_board, x1, y1);
                	}
                	else
                	{
                		//printf("Starting Parallel\n");
                		//make the queue
                		struct queue *q = malloc(sizeof(struct queue));
                		q->buffer = malloc(sizeof(struct job *) * Q_SIZE);
                		q->start = 0;
                		q->end = Q_SIZE - 1;
                		q->c = 0;
                		q->all_produced = 0;

                		pthread_t *threads = malloc(4 * sizeof(threads));

                		//printf("init return structure\n");
                		void **local_boards = malloc(4 * sizeof(int **));
                		for (int i = 0; i < 4; i++)
                		{
                			local_boards[i] = NULL;
                		}
                		clock_gettime(CLOCK_MONOTONIC, &initend);
                		clock_gettime(CLOCK_MONOTONIC, &compstart);
                		//printf("Spawning Threads\n");
                		for (int i = 0; i < 4; i++)
                		{
                			GM *arg = malloc(sizeof(*arg));
                			arg->q = q;
                			arg-> x1 = x1;
                			arg-> x2 = x2;
                			arg-> y1 = y1;
                			arg-> y2 = y2;
                			arg->pid = i;
                			arg->board = start_board;
                			pthread_create(&threads[i], NULL, do_solve, arg);
                		}

                		//printf("Recieving Threads\n");
                		for (int i = 0; i < 4; i++)
                		{
                			pthread_join(threads[i], &local_boards[i]);
                		}
                		start_board = (int **)local_boards[p_found];
                	}

                	//timing end
                	clock_gettime(CLOCK_MONOTONIC, &compend);
                	inittime = BILLION * (initend.tv_sec - initstart.tv_sec) + (initend.tv_nsec - initstart.tv_nsec);
                	inittime = inittime / BILLION;

                	comptime = BILLION * (compend.tv_sec - compstart.tv_sec) + (compend.tv_nsec - compstart.tv_nsec);
                	comptime = comptime / BILLION;
                	totaltime = inittime + comptime;

                	totaltime = comptime + inittime;

                	//prints
                	if (b == -1)
                	{
                		printf("Exited in error\n");
                	}
                	printf("Init time: %lf seconds\n", inittime);
                	printf("Comp time: %lf seconds\n", comptime);
                	printf("Total time: %lf seconds\n", totaltime);
                	printSolution(start_board);
                }